Only patch if you really need to and only if you understand exactly what the patch does.
I'm a qmail enthusiast. I've even written a book about it (buy a copy!). One of the things that any qmail administrator ought to know is that (for a variety of historical reasons) qmail has a lot of available patches. There are good reasons why you might need each patch, but in general you should try to use as few as possible. The general consensus is that the more patches you use, the more likely bugs are to crop up, along with conflicts between patches. The general rule for patching qmail is highlighted across the top of this page. Ignore it at your peril.
That said, I have found a collection of patches that work for me. I have installed them for reasons I explain below, which are not necessarily reasons that matter to everyone. This is a list of the patches I use (and some others I have run across), with descriptions and reasons for using them. Most of these patches were pulled from qmail.org, which has unfortunately vanished.
Some people feel that qmail has certain shortcomings (like non-conformance to RFCs) and either like complaining or have found a solution, or have developed patches to fix the problem. Trust me, the issues have been hashed over again and again and again on the qmail mailing list, and the current state of things seems to keep the most people happy. In most cases, things are the way they are on purpose (please feel free to search the qmail list archives for the explanation of any particular detail!).
These patches all work on netqmail, which you should be using anyway. While vanilla qmail is as cool and unbreachable as ever, netqmail is a convenient packaging of some of the patches that have cropped up as being very important. It is not officially the same thing as qmail, but is a convenience packaging of qmail. For more information, go here.
One final note, some of these patches conflict (or seem to), and resolving them takes a little bit of knowledge of C. If and when I get a chance, I'll put up an über patch collecting and reconciling the patches that I use. For any others, you're on your own.
Apply these patches with the following commands:
    cd /path/to/netqmail/
    patch -p1 < /path/to/patch
  
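Before committing to a patch, GNU patch's --dry-run flag will report whether it applies cleanly without touching any files. A self-contained illustration (the temp directory, file.c, and fix.patch are made up for the demonstration):

```shell
tmp=$(mktemp -d) && cd "$tmp"
printf 'hello\n' > file.c
cat > fix.patch <<'EOF'
--- a/file.c
+++ b/file.c
@@ -1 +1 @@
-hello
+goodbye
EOF
patch -p1 --dry-run < fix.patch   # preview: reports what would change
patch -p1 < fix.patch             # actually apply the change
```

The -p1 option strips the leading "a/" and "b/" path components, which is why the patches above are applied from the top of the netqmail source tree.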
  Once upon a time, back in 1996, there was a really unfortunate bug in the most popular DNS server software (BIND 4.9.3): it did not respond correctly to "CNAME" requests (that is to say, requests for any CNAME data about a particular domain name). This is critical information that an email server needs to know to do its job. Thankfully, there was a way to work around the problem: "ANY" requests. These requests ask the DNS server, essentially, for any and ALL information it has about the domain name in question, including CNAME information.
These ANY queries have two big problems:
This patch reverts that workaround, and uses only CNAME requests instead of ANY requests. Even for big domains that use a CNAME redirection, this answer is tiny.
Technically, using this patch risks being unable to deliver mail to anyone using such an ancient version of BIND. Back in 2002, that was less than 2% of all DNS servers. Today? It's hard to imagine anyone still does.
Spammers often probe mail servers with relay attempts that route mail through them: user@remoteaddress@thismailserver, for example, or user%remoteaddress@thismailserver (aka "the percent hack"). Unfortunately, qmail doesn't reject all of these attempts out of hand, but instead accepts them and
    generates a bounce message. This behavior is technically valid,
    but is unwise: it can be used to create bounce-spam. It's
    important to state: this does NOT make qmail a relay... but it
    can be used as a bounce-spam source (though the content
    of the bounce is not entirely dictated by the sender). Some
    automated relay testing software assumes that if the message is
    accepted then it will be delivered/relayed instead of bounced
    (or black-holed), and as a result will provide an inaccurate
    diagnosis: that your server is an open-relay. Such relay
    testers are incorrect. However, as a result, you may get
    onto one or two blacklists that are based on such relay checks,
    even if you delete such email messages. To avoid this hassle,
    use this patch, written by Russell Nelson, to reject such relay
    attempts. This patch precludes using the percent hack, but you
    shouldn't be allowing that anyway, so it's no big loss.
    (local copy) (qmail.org)

    When building with this patch, run make
    cert before running make setup check.
    Also, you must create a cron job to rebuild the certs daily
    (because otherwise, over time, an attacker could figure out
    what they are). Commonly, when someone indicates that they want
    qmail to support SSL/STARTTLS they will be referred to a
    project like mailfront. While
    mailfront is a worthy project, it doesn't solve the entire
    problem. Specifically, it doesn't enable qmail to use SSL for
    sending mail to other servers that support STARTTLS
    (this is a problem of privacy; but keep in mind that if the
    email is being relayed, it may be transmitted via an
    unencrypted communication later—if you're really worried, use
    PGP). This patch, however, does enable qmail to do that.
    (local copy)
    (inoa.net)

More recently, Amitai Schleier pulled together a selection of useful tools for doing recipient validation using a variety of criteria. You can see his work here.
Using such a patch, it is trivial to implement things like three-tuple greylisting (based on RECIPIENT, SENDER, and TCPREMOTEIP). You can also, as Soffian suggests, use a script that queries another server to see if the recipient is valid. I like this script in part because I can use different verification techniques depending on the domain (for example, I can do one kind of check for lists.memoryhole.net, and another for memoryhole.net itself). I often hear questions on the qmail mailing list that could be solved simply and easily with this (relatively trivial) patch, and every time, I am re-impressed with the power and flexibility that this patch provides. It doesn't get enough credit.
Here's how it works: when the environment variable RCPTCHECK is set, qmail-smtpd will execute the program specified in that variable. Before the program is executed, the recipient in question is stored in the RECIPIENT environment variable, and the sender is stored in the SENDER environment variable. The exit code of the specified program determines whether qmail views the recipient as valid or not. Possible exit codes include: 0 (the recipient is good), 100 (the recipient is bad), and 120 (an error occurred while checking).
A trivial example script would be something like this:
#!/bin/sh
GoodRecipient=0
BadRecipient=100
Error=120
if [ ! -r /var/qmail/control/goodrcptto ]; then
        # the list of good recipients is missing or unreadable
        exit $Error
fi
# -F treats the recipient as a fixed string rather than a regex
if grep -F -q -- "$RECIPIENT" /var/qmail/control/goodrcptto; then
        exit $GoodRecipient
else
        exit $BadRecipient
fi
          Here's a slightly more complex example:
#!/bin/sh
GoodRecipient=0
BadRecipient=100
Error=120
# split the recipient into local part and domain
User=$( echo "$RECIPIENT" | cut -d@ -f1 )
Domain=$( echo "$RECIPIENT" | cut -d@ -f2- )
if ! type id >/dev/null 2>&1 ; then
        # id program not in PATH
        exit $Error
fi
# the recipient is valid if the local part is a real system account
if id "$User" >/dev/null 2>&1 ; then
        exit $GoodRecipient
else
        exit $BadRecipient
fi
          A basic example of how to do greylisting with this patch is here (txt) (based on Kelly French's script). Note that you'd want to combine that script with some other kind of recipient validation as well, and that it needs a cron job to clean up after itself.
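As a concrete illustration of three-tuple greylisting in the RCPTCHECK style, here is a minimal sketch. GREYDIR, MINDELAY, and the use of 120 as the "defer" code are my assumptions for the example, not something the patch dictates; it's wrapped in a function for readability, where a real RCPTCHECK script would exit rather than return:

```shell
# Sketch: defer mail the first time a (IP, sender, recipient) tuple is
# seen; accept once the tuple has waited at least MINDELAY seconds.
greylist_check() {
    : "${GREYDIR:=/var/qmail/greylist}"   # assumed state directory
    : "${MINDELAY:=300}"                  # assumed minimum wait, seconds
    GoodRecipient=0
    Error=120    # assuming this is treated as "try again later"

    # one file per tuple, named by a checksum of the three values
    key=$(printf '%s' "$TCPREMOTEIP $SENDER $RECIPIENT" | cksum | cut -d' ' -f1)
    record="$GREYDIR/$key"
    now=$(date +%s)

    if [ ! -f "$record" ]; then
        mkdir -p "$GREYDIR"
        echo "$now" > "$record"       # first sighting: remember and defer
        return $Error
    fi
    first=$(cat "$record")
    if [ $((now - first)) -ge "$MINDELAY" ]; then
        return $GoodRecipient         # the tuple has waited long enough
    fi
    return $Error
}
```

The return codes mirror the example scripts above (0 = good, 120 = error/defer); as the text notes, a cron job is still needed to expire old tuple records.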
Integrating that script into your qmail setup is simple: rename the qmail-remote program to qmail-remote.orig and install that script as qmail-remote (make sure it's readable and executable by everyone; note that this differs from the original qmail-remote permissions). The script can use two programs to do its job: the dktest program that comes with libdomainkeys (obsolete) and the dkimsign.pl that comes with Perl's Mail::DKIM module. (That script expects dkimsign.pl to accept a --key argument; if yours does not, you can use this simple patch (txt) to modify it so that it does, or download the copy below.)
If you're interested in verifying both DKIM and DomainKey signatures, a similar script that can be used in much the same way as Russ Nelson's program is here (txt). This script relies on dktest as well, but also requires a script called dkimverify.pl. A similarly named script comes with Mail::DKIM, but is not particularly useful; I wrote one that generates some useful headers, which is available below.
Generic copies of those scripts can be had here: dkimsign.pl (txt) and dkimverify.pl (txt)
The qmail queue has two sides: the ingestion side (called "todo") and the delivery side. In vanilla qmail,
    the program that spawns email delivery agents to empty the
    delivery side of the queue is the same program that drains the
    ingestion side of the queue. It must recognize that new email
    has been added to the mail ingestion queue, parse it, generate
    the necessary metadata, and place the message into the mail
    delivery queue. When email comes into the server
    extremely quickly, qmail can sometimes spend so much
    time draining the ingestion queue that deliveries don't get
    scheduled and the email starts accumulating in the "ingest"
    side of the mail queue. If this is sustained, most of your
    email queue may be in an undeliverable state in the
    todo queue rather than the delivery queue (note
    that this is only a problem when you get massive amounts of
    email at the same time (many per second; the threshold is
    system-specific)). This behavior is often referred to as "silly
    qmail syndrome." André Oppermann wrote a patch to solve the
    problem. It creates a separate qmail-todo program,
    whose only job is processing the ingestion/todo queue. This
    allows draining the todo queue to happen asynchronously and
    thus does not prevent deliveries from occurring at the same
    time. The code of the solution is a bit complicated, and is not
    worth applying unless you are experiencing "silly qmail
    syndrome." This patch is referred to as the ext_todo
    patch. (local copy)
    (nrg4u.com)

Within the todo queue, an
    inefficient filesystem that has trouble with large directories
    may restrict the speed of todo processing significantly. Such
    filesystems were standard back in 1998, and qmail had to work
    around this problem for the majority of the mail queue. It did
    so by creating many sub-directories (in essence, hash buckets),
    to limit the number of messages that would likely be in any one
    directory. For whatever reason, this workaround wasn't applied
    to the todo queue. The best way to address this is to
    use a filesystem capable of handling directories with large
    numbers of files. Ext3 (with its dir_index feature), for
    instance, uses a hashed-tree structure to implement directories,
    which handles large numbers of files in a single directory with
    ease. But
    upgrading/modernizing your filesystem is not always possible,
    depending on your situation. Russ Nelson wrote a patch that
    makes qmail use, in the todo portion of the queue, the same
    multi-directory hashing scheme it uses in its main queue. This
    patch is referred to as the "big-todo" patch. There's no reason
    to apply this patch unless you really really need it,
    because getting a better filesystem (or enabling the right
    features on the filesystem you're already using) is a more
    efficient option (though using the multi-directory scheme won't
    really cause any *trouble* on newer filesystems either); this
    patch can require that your queue be rebuilt, and so applying
    it to a running system can be a bit of a pain. (local copy) (qmail.org)
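For a sense of what that multi-directory hashing looks like: as I understand it, qmail names queue files after their inode numbers and buckets them by inode modulo conf-split (default 23). A small sketch, where the temp file is just a stand-in for a freshly queued message:

```shell
SPLIT=23                        # qmail's conf-split default
msg=$(mktemp)                   # stand-in for a queue file
inode=$(ls -i "$msg" | awk '{print $1}')    # bucket key is the inode number
bucket=$((inode % SPLIT))
echo "message $inode would live in queue/mess/$bucket/"
```

With 23 buckets, a queue of ten thousand messages averages only a few hundred files per directory, which is what kept 1998-era filesystems happy.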
4xx errors mean, in essence, "try again later". We can quibble over whether that's the smartest thing to do, but as such, qmail's behavior is consistent with the RFC. The usual complaint is to point out that RFC 2821 says (in section 5):

    To provide reliable mail transmission, the SMTP client MUST be able to try (and retry) each of the relevant addresses in this list in order, until a delivery attempt succeeds.
Qmail is able, and does so if the lowest-preference MX is unreachable. The RFC does not specify the relationship between failure type and MX choice (for example, the language it uses in section 4.2.1 suggests that retrying within the same connection is acceptable), so again there, qmail is consistent with the RFC's stricture. I think there's a real point to be made out of the fact that the RFC authors require that the client must be able to try other MX entries, rather than saying that the client must always try other MX entries. It leaves plenty of room for choosing different retry policies based on the type of error.The policy at issue here, however, is regarding errors given as part of the greeting. When qmail connects to a mail server and is greeted with an error (instead of "
250 Hi there!"), how should that be interpreted, and how should it be handled? What should it mean for that particular message you're trying to deliver? (This is (primarily) what Matthias Andree's patch changes.)

Some (vocal) mail administrators seem to be of the opinion that a greeting error is a reasonable thing to use to indicate an overload situation (for example, that the server is overwhelmed by a spam attack, and cannot handle additional email at the moment). But consider: is this a reasonable thing to do? If a server cannot (or will not) accept email, and this is known at connection time, why accept the connection? It wastes bandwidth, it wastes server resources, it wastes time. Why would anyone use scarce resources, in the middle of being overloaded, to tell senders about it? Why accept a connection when you cannot accept email? It's more efficient to simply refuse the connection.

Imagine if taxicabs worked on the same principle. When they're hired and full, they cannot accept new riders. The easiest and most direct way of not accepting new riders is to ignore the folks on the sidewalk waving at the taxi. The idea that the taxi would pull over to tell them "sorry, I'm busy" seems downright goofy (almost as if the taxi driver is taunting the people on the sidewalk). Similarly, if a server is overloaded, the idea that it would accept connections for the sole purpose of telling the sender "sorry, I'm busy" also seems goofy. If you're busy, you should be using your resources to do your job rather than using them to tell everyone how terribly busy you are. So it seems reasonable to conclude that if a server is willing to accept new connections, then it's probably not overloaded.
However, the more important issue is this: when a server KNOWS it cannot accept email, for whatever reason, what should it do? First, it needs to decide whether it wants that delivery attempt to be retried immediately or later. And, in either case, should the sender re-contact the most-preferable MX (i.e. the one that is currently overloaded) or a less-preferable MX when it retries? And once the answers to those questions are determined, what is the correct (and/or most reliable) means of expressing that intent?
Answering the latter question requires answering another question first: what do backup MX records mean? Overload situations (or, more typically, spammers) are NOT the reason to have secondary/backup MX servers (with higher MX priorities). If you have multiple servers available to deal with high load, why not assign them all the same priority and use them ALL during low-load situations as well (e.g. to decrease latency)? Waiting for one server to become overloaded before involving another (that you already had available) is a lousy management policy because it leads to wasted resources (read: wasted money) during normal-load situations (i.e. most of the time). Additionally, overload is particularly undesirable because it leads to slow response time; it's better to avoid overload completely—if you can—by using all the resources at your disposal. Using backup servers to handle overload is not impossible, obviously, because some people do it, but it is not a smart use of resources.
So, what IS the reason to have secondary MX servers? Since SMTP-compliant senders are already required to queue undeliverable messages and retry later, the primary benefit of a backup MX is to reduce latency in catastrophic situations. For example, you may have an arrangement where you can tell the backup MX to deliver all of the messages it's holding for you in a single block immediately after your primary email system comes back online. That way the messages get delivered as soon as you're ready, rather than waiting for the myriad senders to realize that you're back online and retry; depending on their retry schedules, that could take hours. So what would you use as a backup MX? If it's just another server you have in the machine room, there's no reason not to use it as one of your primary mail servers... To reiterate, there has to be a reason why the backup is less preferable and is not part of your primary mail system. By knowing that, we can know what sort of penalty is associated with using a backup MX, and how much effort should be put into avoiding paying that penalty.
In my opinion, using the MX preference ordering system as an overload failover system is a bad setup. The best way to handle overload is to avoid ever being overloaded, rather than by arranging a failover. Additionally, it seems wiser to assume that the admin is smart and isn't using the MX ordering as a bad overload compensation technique. We ought to assume that the admin is using the MX preference for a good reason (such as "the backup is an arrangement we have with another company in case of emergencies") rather than a bad reason (such as "the admin doesn't know what he's doing"). Thus, it is reasonable to avoid using the backup MX records unless the primary cannot be contacted at all.
Now then, if one of the primary MX servers cannot accept additional email and intends for deliveries to be retried, what is the most effective and efficient means of ensuring that deliveries are retried to the next-most-preferable MX record? Simple: refuse to accept the connection. Accepting the connection just to emit an error message is not only wasteful, but unreliable. At best, it depends on a particular interpretation of a "proposed standard" revision of the email RFC that not everyone agrees on.
For what it's worth, qmail always tries to get the most-preferable MX. Unreachable IPs, however, get added to a 1-hour do-not-try list (see qmail-tcpto man page; it's slightly more complicated, but not much). If the lowest MX was unreachable, qmail won't retry that unreachable IP for an hour, and so will retry the higher MXs. After the one hour timeout, qmail will allow itself to retry the lower MX to see if it came back. Since the standard retry schedule has two retries in less than an hour (one 6 minutes after the first, and the next 26 minutes after the first), if the lowest MX was unreachable, the next two retries will be to the higher MX, but the fourth delivery attempt will start with the lowest MX again.
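The schedule mentioned above follows qmail's quadratic retry rule: retry k happens roughly 400×k² seconds after the initial attempt, which is consistent with the 6-minute and 26-minute figures (400 and 1600 seconds). A minimal illustration:

```shell
# qmail's retry schedule is quadratic: retry k happens about
# 400*k*k seconds after the initial delivery attempt.
for k in 1 2 3 4 5; do
    secs=$((400 * k * k))
    printf 'retry %d: %5d seconds (~%d minutes)\n' "$k" "$secs" $((secs / 60))
done
```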